85 research outputs found
Practical and Optimal LSH for Angular Distance
We show the existence of a Locality-Sensitive Hashing (LSH) family for the
angular distance that yields an approximate Near Neighbor Search algorithm with
the asymptotically optimal running time exponent. Unlike earlier algorithms
with this property (e.g., Spherical LSH [Andoni, Indyk, Nguyen, Razenshteyn
2014], [Andoni, Razenshteyn 2015]), our algorithm is also practical, improving
upon the well-studied hyperplane LSH [Charikar, 2002] in practice. We also
introduce a multiprobe version of this algorithm, and conduct experimental
evaluation on real and synthetic data sets.
We complement the above positive results with a fine-grained lower bound for
the quality of any LSH family for angular distance. Our lower bound implies
that the above LSH family exhibits a trade-off between evaluation time and
quality that is close to optimal for a natural class of LSH functions.
Comment: 22 pages; an extended abstract is to appear in the proceedings of the
29th Annual Conference on Neural Information Processing Systems (NIPS 2015).
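For readers unfamiliar with the hyperplane LSH baseline [Charikar, 2002] that the abstract above compares against, the following is a minimal sketch of that baseline only, not of the paper's new LSH family. It assumes NumPy is available; the names `make_hyperplane_hash` and `collision_fraction` are hypothetical and chosen for illustration. Each hash bit records which side of a random hyperplane through the origin a vector falls on, so two vectors at angle θ agree on a given bit with probability 1 − θ/π.

```python
import numpy as np


def make_hyperplane_hash(dim, num_bits, seed=0):
    """Return a function mapping a vector to a tuple of sign bits.

    Illustrative helper (not from the paper): draws `num_bits` random
    Gaussian normal vectors, one per hyperplane through the origin.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((num_bits, dim))

    def hash_vector(x):
        # Bit i is 1 when x lies on the non-negative side of hyperplane i.
        projections = planes @ np.asarray(x, dtype=float)
        return tuple(int(p >= 0) for p in projections)

    return hash_vector


def collision_fraction(u, v, num_bits=4096, seed=0):
    """Fraction of hash bits on which u and v agree.

    For large `num_bits` this approaches 1 - theta/pi, where theta is
    the angle between u and v.
    """
    h = make_hyperplane_hash(len(u), num_bits, seed)
    hu, hv = h(u), h(v)
    return sum(a == b for a, b in zip(hu, hv)) / num_bits


if __name__ == "__main__":
    u = [1.0, 0.0]
    v = [0.0, 1.0]  # orthogonal to u, so theta = pi/2
    print(collision_fraction(u, u))  # identical vectors always collide
    print(collision_fraction(u, v))  # expected to be close to 0.5
```

Because only the sign of each projection is kept, the hash depends on a vector's direction and not on its length, which is why this family suits angular distance.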
Using Minimum Description Length for Process Mining
In process mining, the goal is to automatically extract process models from event logs. Many algorithms have recently been proposed for this task, along with different quality measures for comparing the resulting models. Most of these measures, however, have several disadvantages: they are model-dependent, assume that the model that generated the log is known, or need negative examples of event sequences. In this paper we propose a new measure for evaluating the quality of process models, based on the minimum description length principle, that does not have these disadvantages. To illustrate the properties of the new measure, we conduct experiments and discuss the trade-off between model complexity and compression.
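As background for the abstract above, the minimum description length principle is commonly stated as a two-part code. The formulation below is the generic textbook form, given here as an assumption; the abstract does not spell out the exact measure the paper defines. The symbols D (the event log), M (a candidate process model), and the model class 𝓜 are introduced for illustration only.

```latex
\[
  M^{*} \;=\; \arg\min_{M \in \mathcal{M}} \bigl( L(M) + L(D \mid M) \bigr)
\]
```

Here L(M) is the number of bits needed to describe the model and L(D | M) is the number of bits needed to describe the log with the help of the model. A more detailed model raises L(M) but usually lowers L(D | M), which is one way to read the trade-off between model complexity and compression mentioned in the abstract.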
TU/e data scientists predict growth of coronavirus infections per country
Predictions of infections and deaths (three days ahead and maximum) resulting from the coronavirus in the Netherlands (as a whole and per province) and 13 other countries.
The predictions give a clear picture of how the coronavirus pandemic is developing in the Netherlands and 13 other countries, and can help governments and healthcare organisations take the necessary measures. In addition, the information can contribute to a more complete and accurate picture of the coronavirus pandemic among the public and the media. The provincial predictions contribute to forecasting hospital admissions.
Towards EPC Semantics based on State and Context
Jan Mendling, Wil van der Aalst
Abstract: The semantics of the OR-join have been discussed for some time, in the context of EPCs but also in the context of other business process modeling languages such as YAWL. In this paper, we show that the existing solutions do not match the intuition of the modeler. Furthermore, we present a novel approach to defining EPC semantics based on state and context. The approach uses two types of annotations for arcs. As in some of the other approaches, arcs are annotated with positive and negative tokens. Moreover, each arc has a context status denoting whether a positive token may still arrive. Using a four-phase approach, tokens and statuses are propagated, yielding a new kind of semantics that overcomes some of the well-known problems related to OR-joins in EPCs.